
    Superpixel-based Semantic Segmentation Trained by Statistical Process Control

    Semantic segmentation, like other fields of computer vision, has seen remarkable performance advances through the use of deep convolutional neural networks. However, because neighboring pixels are heavily dependent on each other, both training and testing of these methods involve many redundant operations. To resolve this problem, the proposed network is trained and tested with only 0.37% of total pixels by superpixel-based sampling, which largely reduces the complexity of the upsampling calculation. The hypercolumn feature maps are constructed by a pyramid module in combination with the convolution layers of the base network. Since the proposed method uses a very small number of sampled pixels, end-to-end learning of the entire network is difficult with a common learning rate for all the layers. To resolve this problem, the learning rate after sampling is controlled by statistical process control (SPC) of gradients in each layer. The proposed method performs better than or equal to conventional methods that use many more samples on the Pascal Context and SUN-RGBD datasets. Comment: Accepted in British Machine Vision Conference (BMVC), 201
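The abstract does not say how SPC is applied to the gradients, so the following is only a minimal sketch of one generic reading: a Shewhart-style control chart on a layer's recent gradient norms, with the learning rate scaled when the current norm leaves the control limits. The function names, the 3-sigma limits, and the scaling factor are all assumptions made for illustration, not the authors' rule.

```python
import statistics


def control_limits(history, k=3.0):
    """Lower and upper control limits (mean -/+ k * sigma) of past gradient norms."""
    mean = statistics.fmean(history)
    sigma = statistics.pstdev(history)
    return mean - k * sigma, mean + k * sigma


def adjust_learning_rate(lr, grad_norm, history, k=3.0, factor=0.5):
    """Hypothetical per-layer rule: shrink lr when the gradient norm exceeds
    the upper limit, grow it when the norm falls below the lower limit,
    and leave it unchanged while the process is 'in control'."""
    lower, upper = control_limits(history, k)
    if grad_norm > upper:
        return lr * factor
    if grad_norm < lower:
        return lr / factor
    return lr
```

In a training loop, `history` would hold the recent gradient norms of one layer, and the rule would be evaluated separately for each layer.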

    Frontal top-down signals increase coupling of auditory low-frequency oscillations to continuous speech in human listeners

    Humans show a remarkable ability to understand continuous speech even under adverse listening conditions. This ability critically relies on dynamically updated predictions of incoming sensory information, but exactly how top-down predictions improve speech processing is still unclear. Brain oscillations are a likely mechanism for these top-down predictions [1 and 2]. Quasi-rhythmic components in speech are known to entrain low-frequency oscillations in auditory areas [3 and 4], and this entrainment increases with intelligibility [5]. We hypothesize that top-down signals from frontal brain areas causally modulate the phase of brain oscillations in auditory cortex. We use magnetoencephalography (MEG) to monitor brain oscillations in 22 participants during continuous speech perception. We characterize prominent spectral components of speech-brain coupling in auditory cortex and use causal connectivity analysis (transfer entropy) to identify the top-down signals driving this coupling more strongly during intelligible speech than during unintelligible speech. We report three main findings. First, frontal and motor cortices significantly modulate the phase of speech-coupled low-frequency oscillations in auditory cortex, and this effect depends on intelligibility of speech. Second, top-down signals are significantly stronger for left auditory cortex than for right auditory cortex. Third, speech-auditory cortex coupling is enhanced as a function of stronger top-down signals. Together, our results suggest that low-frequency brain oscillations play a role in implementing predictive top-down control during continuous speech perception and that top-down control is largely directed at left auditory cortex. This suggests a close relationship between (left-lateralized) speech production areas and the implementation of top-down control in continuous speech perception
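For orientation, the general definition of transfer entropy from a source signal $X$ to a target signal $Y$ is given below, in its simplest form with a history length of one sample. The symbols $x_t$ and $y_t$ are illustrative; the abstract does not specify the estimator, the embedding, or the delays used in the study.

```latex
\mathrm{TE}_{X \to Y}
  = \sum_{y_{t+1},\, y_t,\, x_t}
    p(y_{t+1}, y_t, x_t)\,
    \log \frac{p(y_{t+1} \mid y_t, x_t)}{p(y_{t+1} \mid y_t)}
```

The quantity is zero when the past of $X$ adds no information about the next value of $Y$ beyond what the past of $Y$ already provides, and it is asymmetric in $X$ and $Y$, which is what makes it usable as a directed connectivity measure.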

    Predictive entrainment of natural speech through two fronto-motor top-down channels

    Natural communication between interlocutors is enabled by the ability to predict upcoming speech in a given context. Previously we showed that these predictions rely on a fronto-motor top-down control of low-frequency oscillations in auditory-temporal brain areas that track intelligible speech. However, a comprehensive spatio-temporal characterisation of this effect is still missing. Here, we applied transfer entropy to source-localised MEG data during continuous speech perception. First, at low frequencies (1–4 Hz, brain delta phase to speech delta phase), predictive effects start in left fronto-motor regions and progress to right temporal regions. Second, at higher frequencies (14–18 Hz, brain beta power to speech delta phase), predictive patterns show a transition from left inferior frontal gyrus via left precentral gyrus to left primary auditory areas. Our results suggest a progression of prediction processes from higher-order to early sensory areas in at least two different frequency channels

    Validation of Yoon's Critical Thinking Disposition Instrument

    Summary. Purpose: The lack of reliable and valid evaluation tools targeting Korean nursing students' critical thinking (CT) abilities has been reported as one of the barriers to instructing and evaluating students in undergraduate programs. Yoon's Critical Thinking Disposition (YCTD) instrument was developed for Korean nursing students, but few studies have assessed its validity. This study aimed to validate the YCTD. Specifically, the YCTD was assessed to identify its cross-sectional and longitudinal measurement invariance. Methods: This was a validation study in which a cross-sectional and longitudinal (prenursing and postnursing practicum) survey was used to validate the YCTD using 345 nursing students at three universities in Seoul, Korea. The participants' CT abilities were assessed using the YCTD before and after completing an established pediatric nursing practicum. The validity of the YCTD was estimated, and then a group invariance test using multigroup confirmatory factor analysis was performed to confirm the measurement compatibility of multigroups. Results: A test of the seven-factor model showed that the YCTD demonstrated good construct validity. Multigroup confirmatory factor analysis findings for the measurement invariance suggested that this model structure demonstrated strong invariance between groups (i.e., configural, factor loading, and intercept combined) but weak invariance within a group (i.e., configural and factor loading combined). Conclusions: In general, traditional methods for assessing instrument validity have been less than thorough. In this study, multigroup confirmatory factor analysis using cross-sectional and longitudinal measurement data allowed validation of the YCTD. This study concluded that the YCTD can be used for evaluating Korean nursing students' CT abilities.

    Test-time Adaptation vs. Training-time Generalization: A Case Study in Human Instance Segmentation using Keypoints Estimation

    We consider the problem of improving the human instance segmentation mask quality for a given test image using keypoints estimation. We compare two alternative approaches. The first approach is a test-time adaptation (TTA) method, where we allow test-time modification of the segmentation network's weights using a single unlabeled test image. In this approach, we do not assume test-time access to the labeled source dataset. More specifically, our TTA method consists of using the keypoints estimates as pseudo labels and backpropagating them to adjust the backbone weights. The second approach is a training-time generalization (TTG) method, where we permit offline access to the labeled source dataset but not the test-time modification of weights. Furthermore, we do not assume the availability of any images from or knowledge about the target domain. Our TTG method consists of augmenting the backbone features with those generated by the keypoints head and feeding the aggregate vector to the mask head. Through a comprehensive set of ablations, we evaluate both approaches and identify several factors limiting the TTA gains. In particular, we show that in the absence of a significant domain shift, TTA may hurt and TTG shows only a small gain in performance, whereas for a large domain shift, TTA gains are smaller and dependent on the heuristics used, while TTG gains are larger and robust to architectural choices.

    MEGAN: Mixture of Experts of Generative Adversarial Networks for Multimodal Image Generation

    Recently, generative adversarial networks (GANs) have shown promising performance in generating realistic images. However, they often struggle to learn complex underlying modalities in a given dataset, resulting in poor-quality generated images. To mitigate this problem, we present a novel approach called mixture of experts GAN (MEGAN), an ensemble approach of multiple generator networks. Each generator network in MEGAN specializes in generating images with a particular subset of modalities, e.g., an image class. Instead of incorporating a separate step of handcrafted clustering of multiple modalities, our proposed model is trained through end-to-end learning of multiple generators via gating networks, which are responsible for choosing the appropriate generator network for a given condition. We adopt the categorical reparameterization trick for a categorical decision to be made in selecting a generator while maintaining the flow of the gradients. We demonstrate that individual generators learn different and salient subparts of the data and achieve a multiscale structural similarity (MS-SSIM) score of 0.2470 for CelebA and a competitive unsupervised inception score of 8.33 in CIFAR-10. Comment: 27th International Joint Conference on Artificial Intelligence (IJCAI 2018)
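The categorical reparameterization trick is commonly realized as Gumbel-softmax sampling. The sketch below shows that general technique in plain Python, assuming the gating network outputs one logit per generator; `gumbel_softmax` and `select_generator` are illustrative names, and nothing here is taken from the MEGAN implementation. In a real model the relaxed weights would be computed in an autodiff framework so gradients can flow through them.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample (weights summing to 1) over the categories."""
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def select_generator(logits, temperature=1.0, rng=random):
    """Pick a generator index (hard choice) plus the soft weights behind it."""
    weights = gumbel_softmax(logits, temperature, rng)
    index = max(range(len(weights)), key=weights.__getitem__)
    return index, weights
```

Lower temperatures push the soft weights toward a one-hot vector, at the cost of higher-variance gradients in a differentiable implementation.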

    Gating of memory encoding of time-delayed cross-frequency MEG networks revealed by graph filtration based on persistent homology

    To explain gating of memory encoding, magnetoencephalography (MEG) was analyzed over a multi-regional network of negative correlations between alpha band power during cue (cue-alpha) and gamma band power during item presentation (item-gamma) in the Remember (R) and No-remember (NR) conditions. Persistent homology with graph filtration on alpha-gamma correlation disclosed topological invariants to explain memory gating. Instruction compliance (R-hits minus NR-hits) was significantly related to negative coupling between the left superior occipital (cue-alpha) and the left dorsolateral superior frontal gyri (item-gamma) on permutation test, where the coupling was stronger in R than NR. In good memory performers (R-hits minus false alarm), the coupling was stronger in R than NR between the right posterior cingulate (cue-alpha) and the left fusiform gyri (item-gamma). Gating of memory encoding was dictated by inter-regional negative alpha-gamma coupling. Our graph filtration over the MEG network revealed that this inter-regional time-delayed cross-frequency connectivity serves gating of memory encoding.
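As a rough illustration of graph filtration, the sketch below tracks the number of connected components (the zeroth Betti number) of a weighted network as the edge threshold grows, using union-find. The distance matrix, the function names, and the choice to restrict to connected components are assumptions; the abstract does not state how correlations were converted to distances or which invariants were computed.

```python
def component_merge_thresholds(dist):
    """Return the sorted edge weights at which two components merge
    when edges are added in order of increasing distance."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    merges = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            merges.append(weight)
    return merges


def betti0_at(merges, n_nodes, threshold):
    """Number of connected components once all edges with weight <= threshold are included."""
    return n_nodes - sum(1 for weight in merges if weight <= threshold)
```

For example, with a hypothetical 4-region distance matrix in which regions 0 and 1 are close and regions 2 and 3 are close, the component count drops from 4 to 1 as the threshold passes each merge weight.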